On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine

Authors

  • David Guennec
  • Damien Lolive
Abstract

Unit selection speech synthesis systems generally rely on target and concatenation costs to select the best unit sequence. The role of the concatenation cost is to ensure that joining two voice segments does not introduce any acoustic artefact. For this task, acoustic distances (MFCC, F0) are typically used, but in many cases this is not enough to prevent concatenation artefacts. Among other strategies, improving corpus coverage by favoring units that naturally support the joining process well (vocalic sandwiches) seems to be effective for TTS. In this paper, we investigate whether vocalic sandwiches can be used directly in the unit selection engine when the corpus was not created using that principle. First, the sandwich approach is directly transposed into the unit selection engine with a penalty that greatly favors concatenation on sandwich boundaries. Second, a derived fuzzy version is proposed to relax the penalty based on the concatenation cost, with respect to the cost distribution. We show that the sandwich approach, very efficient at the corpus creation step, seems to be inefficient when directly transposed into the unit selection engine. However, we observe that the fuzzy approach enhances synthesis quality, especially on sentences with high concatenation costs.
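
The two penalty schemes named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' formulation: the function names, the penalty value of 10.0, the choice of mean plus or minus one standard deviation as thresholds on the cost distribution, and the linear ramp between them. The only idea taken from the abstract is that a hard penalty is applied to joins outside sandwich boundaries, and that the fuzzy variant relaxes this penalty depending on where the concatenation cost falls in the cost distribution.

```python
import statistics


def hard_penalty(on_sandwich_boundary, penalty=10.0):
    """Hard scheme: joins on a sandwich boundary are free, all others pay
    the full penalty (the value 10.0 is an arbitrary illustration)."""
    return 0.0 if on_sandwich_boundary else penalty


def cost_thresholds(costs):
    """Hypothetical thresholds from the cost distribution: one standard
    deviation below and above the mean concatenation cost."""
    mean = statistics.mean(costs)
    std = statistics.pstdev(costs)
    return mean - std, mean + std


def fuzzy_penalty(concat_cost, low, high, on_sandwich_boundary, penalty=10.0):
    """Fuzzy scheme (assumed linear ramp): a join outside a sandwich
    boundary pays nothing if its concatenation cost is already low, the
    full penalty if it is high, and a proportional share in between."""
    if on_sandwich_boundary or concat_cost <= low:
        return 0.0
    if concat_cost >= high:
        return penalty
    return penalty * (concat_cost - low) / (high - low)


# Toy concatenation costs standing in for values observed on a corpus.
observed_costs = [1.0, 2.0, 3.0, 4.0, 5.0]
low, high = cost_thresholds(observed_costs)
mid_penalty = fuzzy_penalty(3.0, low, high, on_sandwich_boundary=False)
```

In a unit selection search, such a penalty would be added to the concatenation cost of each candidate join before the best path is chosen. The ramp shape and thresholds are design choices that would need tuning on real data.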

Similar resources

Vocalic sandwich, a unit designed for unit selection TTS

Unit selection text-to-speech systems currently produce very natural synthetic sentences by concatenating speech segments from a large database. Recently, the increasing demand for designing high quality voices with less data has created a need for further optimization of the textual corpus recorded by the speaker. The optimization process of this corpus is traditionally guided by the coverage rate of we...

Towards Optimal TTS Corpora

Unit selection text-to-speech systems currently produce very natural synthesized phrases by concatenating speech segments from a large database. Recently, the increasing demand for designing high quality voices with less data has created a need for further optimization of the textual corpus recorded by the speaker. This corpus is traditionally the result of a condensation process: sentences are selec...

Phonetically Transcribed Speech Corpus Designed for Context Based European Portuguese TTS

This paper presents a speech corpus for European Portuguese (EP), designed for context-based text-to-speech (TTS) synthesis systems. The speech corpus is intended for small-footprint engines and is composed of one sentence dedicated to each sequence of two phonemes of the language, incorporating as many language contexts as possible at diphone and word levels. The speech corpus is presented in ...

Phonetically enriched labeling in unit selection TTS synthesis

Unit selection techniques have improved the quality of text-to-speech (TTS) synthesis. However, mistakes that were less noticeable in poorer-quality synthetic speech become very noticeable in more natural-sounding synthetic speech. Many problems appear to be caused by mismatches between phones requested by the TTS front end and phones selected from the labeled speech inventory. Gi...

Efficient and Scalable Met Generation in Corpus-b

This paper proposes performance indices and search criteria for text script generation in the design of corpus-based TTS systems. Based on these criteria, a new search method is presented to solve the text selection problem more systematically and efficiently. Experimental results show that, with the same hit rate of unit types, the new method can reduce the text script size by up to 40% in some...

Publication date: 2016